    Super-Fast 3-Ruling Sets

    A $t$-ruling set of a graph $G = (V, E)$ is a vertex subset $S \subseteq V$ that is independent and satisfies the property that every vertex $v \in V$ is at a distance of at most $t$ from some vertex in $S$. A maximal independent set (MIS) is a 1-ruling set. The problem of computing an MIS on a network is a fundamental problem in distributed algorithms, and the fastest algorithm for this problem is the $O(\log n)$-round algorithm due to Luby (SICOMP 1986) and Alon et al. (J. Algorithms 1986) from more than 25 years ago. Since then the problem has resisted all efforts to yield to a sub-logarithmic algorithm. There has been recent progress on this problem, most importantly an $O(\log \Delta \cdot \sqrt{\log n})$-round algorithm on graphs with $n$ vertices and maximum degree $\Delta$, due to Barenboim et al. (Barenboim, Elkin, Pettie, and Schneider, April 2012, arXiv 1202.1983; to appear FOCS 2012). We approach the MIS problem from a different angle and ask whether O(1)-ruling sets can be computed much more efficiently than an MIS. As an answer to this question, we show how to compute a 2-ruling set of an $n$-vertex graph in $O((\log n)^{3/4})$ rounds. We also show that the above result can be improved for special classes of graphs such as graphs with high girth, trees, and graphs of bounded arboricity. Our main technique involves randomized sparsification that rapidly reduces the graph degree while ensuring that every deleted vertex is close to some vertex that remains. This technique may have further applications in other contexts, e.g., in designing sub-logarithmic distributed approximation algorithms. Our results raise intriguing questions about how quickly an MIS (or 1-ruling set) can be computed, given that 2-ruling sets can be computed in sub-logarithmic rounds.
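
    The ruling-set definition above can be checked mechanically. The following is a minimal centralized sketch, not taken from the paper: the function name `is_t_ruling_set`, the adjacency-dictionary representation, and the example path graph are all assumptions made for illustration.

```python
from collections import deque

def is_t_ruling_set(adj, S, t):
    """Return True if S is independent and every vertex is within
    distance t of some vertex in S (hypothetical helper)."""
    S = set(S)
    # Independence: no edge may join two members of S.
    for u in S:
        if any(v in S for v in adj[u]):
            return False
    # Multi-source BFS from S yields each vertex's distance to the nearest member of S.
    dist = {u: 0 for u in S}
    queue = deque(S)
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    # Unreached vertices have no member of S at any finite distance.
    return all(v in dist and dist[v] <= t for v in adj)

# Illustrative input: the path 0 - 1 - 2 - 3 - 4.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(is_t_ruling_set(path, {0, 4}, 2))     # endpoints rule the path at distance 2
print(is_t_ruling_set(path, {0, 2, 4}, 1))  # an MIS is a 1-ruling set
```

    On this path, the endpoints form a 2-ruling set but not a 1-ruling set, which illustrates why a ruling set with larger $t$ can be much sparser than an MIS.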

    Super-Fast Distributed Algorithms for Metric Facility Location

    This paper presents a distributed O(1)-approximation algorithm, with expected-$O(\log \log n)$ running time, in the $\mathcal{CONGEST}$ model for the metric facility location problem on a size-$n$ clique network. Though metric facility location has been considered by a number of researchers in low-diameter settings, this is the first sub-logarithmic-round algorithm for the problem that yields an O(1)-approximation in the setting of non-uniform facility opening costs. In order to obtain this result, our paper makes three main technical contributions. First, we show a new lower bound for metric facility location, extending the lower bound of Bădoiu et al. (ICALP 2005) that applies only to the special case of uniform facility opening costs. Next, we demonstrate a reduction of the distributed metric facility location problem to the problem of computing an O(1)-ruling set of an appropriate spanning subgraph. Finally, we present a sub-logarithmic-round (in expectation) algorithm for computing a 2-ruling set in a spanning subgraph of a clique. Our algorithm accomplishes this by using a combination of randomized and deterministic sparsification. Comment: 15 pages, 2 figures. This is the full version of a paper that appeared in ICALP 201

    On the Analysis of a Label Propagation Algorithm for Community Detection

    This paper initiates formal analysis of a simple, distributed algorithm for community detection on networks. We analyze an algorithm that we call Max-LPA, both in terms of its convergence time and in terms of the "quality" of the communities detected. Max-LPA is an instance of a class of community detection algorithms called label propagation algorithms. As far as we know, most analysis of label propagation algorithms thus far has been empirical in nature, and in this paper we seek a theoretical understanding of label propagation algorithms. In our main result, we define a clustered version of Erdős–Rényi random graphs with clusters $V_1, V_2, \ldots, V_k$ where the probability $p$ of an edge connecting nodes within a cluster $V_i$ is higher than $p'$, the probability of an edge connecting nodes in distinct clusters. We show that even with fairly general restrictions on $p$ and $p'$ ($p = \Omega(\frac{1}{n^{1/4-\epsilon}})$ for any $\epsilon > 0$, $p' = O(p^2)$, where $n$ is the number of nodes), Max-LPA detects the clusters $V_1, V_2, \ldots, V_k$ in just two rounds. Based on this and on empirical results, we conjecture that Max-LPA can correctly and quickly identify communities on clustered Erdős–Rényi graphs even when the clusters are much sparser, i.e., with $p = \frac{c\log n}{n}$ for some $c > 1$. Comment: 17 pages. Submitted to ICDCN 201
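
    To make the label propagation idea concrete, here is a small sketch of one plausible reading of such an algorithm. The abstract does not spell out the update rule, so the following are assumptions: nodes start with unique labels, rounds are synchronous, each node adopts the most frequent label in its closed neighborhood, and ties go to the larger label. The function names and the example graph are hypothetical.

```python
from collections import Counter

def max_lpa_round(adj, labels):
    """One synchronous round (assumed rule): adopt the most frequent label
    in the closed neighborhood, breaking ties toward the larger label."""
    new_labels = {}
    for v in adj:
        counts = Counter(labels[u] for u in adj[v])
        counts[labels[v]] += 1  # the closed neighborhood includes v itself
        new_labels[v] = max(counts, key=lambda lab: (counts[lab], lab))
    return new_labels

def max_lpa(adj, rounds):
    """Run the assumed update rule for a fixed number of rounds."""
    labels = {v: v for v in adj}  # assumed: unique initial labels
    for _ in range(rounds):
        labels = max_lpa_round(adj, labels)
    return labels

# Illustrative input: two 4-cliques {0,1,2,3} and {4,5,6,7} joined by edge 3-4.
adj = {
    0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2, 4],
    4: [3, 5, 6, 7], 5: [4, 6, 7], 6: [4, 5, 7], 7: [4, 5, 6],
}
print(max_lpa(adj, 2))
```

    On this toy graph, two rounds suffice for every node of the first clique to share one label and every node of the second clique to share another. This is only an illustration on a hand-picked input, not evidence for the theorem stated in the abstract.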

    Using Read-k Inequalities to Analyze a Distributed MIS Algorithm

    Until recently, the fastest distributed MIS algorithm, even for simple graphs such as unoriented trees, has been the simple randomized algorithm discovered in the 1980s. This algorithm (commonly called Luby's algorithm) computes an MIS in $O(\log n)$ rounds (with high probability). This situation changed when Lenzen and Wattenhofer (PODC 2011) presented a randomized $O(\sqrt{\log n}\cdot \log\log n)$-round MIS algorithm for unoriented trees. This algorithm was improved by Barenboim et al. (FOCS 2012), resulting in an $O(\sqrt{\log n \cdot \log\log n})$-round MIS algorithm. The analyses of these tree MIS algorithms depend on "near independence" of probabilistic events, a feature of the tree structure of the network. In their paper, Lenzen and Wattenhofer hope that their algorithm and analysis could be extended to graphs with bounded arboricity. We show how to do this. By using a new tail inequality for read-$k$ families of random variables due to Gavinsky et al. (Random Struct Algorithms, 2015), we show how to deal with dependencies induced by the recent tree MIS algorithms when they are executed on bounded arboricity graphs. Specifically, we analyze a version of the tree MIS algorithm of Barenboim et al. and show that it runs in $O(\mbox{poly}(\alpha) \cdot \sqrt{\log n \cdot \log\log n})$ rounds in the $\mathcal{CONGEST}$ model for graphs with arboricity $\alpha$. While the main thrust of this paper is the new probabilistic analysis via read-$k$ inequalities, for small values of $\alpha$, this algorithm is faster than the bounded arboricity MIS algorithm of Barenboim et al. We also note that recently (SODA 2016), Ghaffari presented a novel MIS algorithm for general graphs that runs in $O(\log \Delta) + 2^{O(\sqrt{\log\log n})}$ rounds; a corollary of this algorithm is an $O(\log \alpha + \sqrt{\log n})$-round MIS algorithm on arboricity-$\alpha$ graphs. Comment: To appear in PODC 2016 as a brief announcement
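
    For readers unfamiliar with the classical randomized MIS algorithm mentioned above, the following sketch simulates a common random-priority variant on a single machine. It is an illustration under assumptions, not the algorithm analyzed in this paper: the name `luby_mis`, the seeded generator, and the cycle example are hypothetical, and the tree and bounded-arboricity algorithms discussed in the abstract are more involved.

```python
import random

def luby_mis(adj, seed=0):
    """Simulate a random-priority variant of Luby-style MIS.
    Returns the computed set and the number of simulated rounds."""
    rng = random.Random(seed)
    active = set(adj)
    mis = set()
    rounds = 0
    while active:
        rounds += 1
        # Each active node draws a fresh random priority this round.
        priority = {v: rng.random() for v in active}
        # A node joins the MIS if it beats every active neighbor.
        winners = {
            v for v in active
            if all(priority[v] < priority[u] for u in adj[v] if u in active)
        }
        mis |= winners
        # Winners and their active neighbors drop out of the computation.
        removed = set(winners)
        for v in winners:
            removed.update(u for u in adj[v] if u in active)
        active -= removed
    return mis, rounds

# Illustrative input: a cycle on 10 nodes.
n = 10
cycle = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
mis, rounds = luby_mis(cycle)
print(sorted(mis), rounds)
```

    Two adjacent nodes can never both win a round, so the output is independent, and a node leaves the computation only when it or one of its neighbors joins the set, so the output is maximal.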

    Super-Fast MST Algorithms in the Congested Clique Using o(m) Messages

    In a sequence of recent results (PODC 2015 and PODC 2016), the running time of the fastest algorithm for the minimum spanning tree (MST) problem in the Congested Clique model was first improved to O(log(log(log(n)))) from O(log(log(n))) (Hegeman et al., PODC 2015) and then to O(log^*(n)) (Ghaffari and Parter, PODC 2016). All of these algorithms use Theta(n^2) messages, independent of the number of edges in the input graph. This paper positively answers a question raised in Hegeman et al., and presents the first "super-fast" MST algorithm with o(m) message complexity for input graphs with m edges. Specifically, we present an algorithm running in O(log^*(n)) rounds, with message complexity ~O(sqrt{m * n}), and then build on this algorithm to derive a family of algorithms, containing for any epsilon, 0 < epsilon <= 1, an algorithm running in O(log^*(n)/epsilon) rounds, using ~O(n^{1 + epsilon}/epsilon) messages. Setting epsilon = log(log(n))/log(n) leads to the first sub-logarithmic round Congested Clique MST algorithm that uses only ~O(n) messages. Our primary tools in achieving these results are (i) a component-wise bound on the number of candidates for MST edges, extending the sampling lemma of Karger, Klein, and Tarjan (JACM 1995) and (ii) Theta(log(n))-wise-independent linear graph sketches (Cormode and Firmani, Dist. Par. Databases, 2014) for generating MST candidate edges.
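
    The effect of the choice epsilon = log(log(n))/log(n) can be verified by direct substitution. The short derivation below is a worked check rather than part of the abstract; it assumes logarithms to base 2 and that the ~O notation hides polylogarithmic factors.

```latex
\[
n^{\epsilon} = 2^{\epsilon \log n} = 2^{\log\log n} = \log n
\quad\Longrightarrow\quad
\frac{n^{1+\epsilon}}{\epsilon}
  = \frac{n \log n}{\log\log n / \log n}
  = \frac{n \log^{2} n}{\log\log n}
  = \tilde{O}(n),
\]
\[
\frac{\log^{*} n}{\epsilon}
  = \frac{\log^{*} n \cdot \log n}{\log\log n}
  = o(\log n),
\]
```

    where the last step uses the fact that log^*(n) grows more slowly than log(log(n)), so the round complexity is indeed sub-logarithmic.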

    Analysis of the Worst Case Space Complexity of a PR Quadtree

    We demonstrate that a resolution-r PR quadtree containing n points has, in the worst case, at most nodes. This captures the fact that as n tends towards 4^r, the number of nodes in a PR quadtree quickly approaches O(n). This is a more precise estimation of the worst case space requirement of a PR quadtree than has been attempted before.

    Sample-And-Gather: Fast Ruling Set Algorithms in the Low-Memory MPC Model

    Motivated by recent progress on symmetry breaking problems such as maximal independent set (MIS) and maximal matching in the low-memory Massively Parallel Computation (MPC) model (e.g., Behnezhad et al. PODC 2019; Ghaffari-Uitto SODA 2019), we investigate the complexity of ruling set problems in this model. The MPC model has become very popular as a model for large-scale distributed computing and it comes with the constraint that the memory-per-machine is strongly sublinear in the input size. For graph problems, extremely fast MPC algorithms have been designed assuming $\tilde{\Omega}(n)$ memory-per-machine, where $n$ is the number of nodes in the graph (e.g., the $O(\log \log n)$ MIS algorithm of Ghaffari et al., PODC 2018). However, it has proven much more difficult to design fast MPC algorithms for graph problems in the low-memory MPC model, where the memory-per-machine is restricted to being strongly sublinear in the number of nodes, i.e., $O(n^{\epsilon})$ for constant $0 < \epsilon < 1$. In this paper, we present an algorithm for the 2-ruling set problem, running in $\tilde{O}(\log^{1/6} \Delta)$ rounds whp, in the low-memory MPC model. Here $\Delta$ is the maximum degree of the graph. We then extend this result to $\beta$-ruling sets for any integer $\beta > 1$. Specifically, we show that a $\beta$-ruling set can be computed in the low-memory MPC model with $O(n^{\epsilon})$ memory-per-machine in $\tilde{O}(\beta \cdot \log^{1/(2^{\beta+1}-2)} \Delta)$ rounds, whp. From this it immediately follows that a $\beta$-ruling set for $\beta = \Omega(\log \log \log \Delta)$ can be computed in just $O(\beta \log \log n)$ rounds whp. The above results assume a total memory of $\tilde{O}(m + n^{1+\epsilon})$. We also present algorithms for $\beta$-ruling sets in the low-memory MPC model assuming that the total memory over all machines is restricted to $\tilde{O}(m)$. For $\beta > 1$, these algorithms are all substantially faster than the Ghaffari-Uitto $\tilde{O}(\sqrt{\log \Delta})$-round MIS algorithm in the low-memory MPC model.
    All our results follow from a Sample-and-Gather Simulation Theorem that shows how random-sampling-based Congest algorithms can be efficiently simulated in the low-memory MPC model. We expect this simulation theorem to be of independent interest beyond the ruling set algorithms derived here.

    Large-Scale Distributed Algorithms for Facility Location with Outliers

    This paper presents fast, distributed, O(1)-approximation algorithms for metric facility location problems with outliers in the Congested Clique model, the Massively Parallel Computation (MPC) model, and the k-machine model. The paper considers Robust Facility Location and Facility Location with Penalties, two versions of the facility location problem with outliers proposed by Charikar et al. (SODA 2001). The paper also considers two alternatives for specifying the input: the input metric can be provided explicitly (as an n x n matrix distributed among the machines) or implicitly as the shortest path metric of a given edge-weighted graph. The results in the paper are:
    - Implicit metric: For both problems, O(1)-approximation algorithms running in O(poly(log n)) rounds in the Congested Clique and the MPC model and O(1)-approximation algorithms running in O~(n/k) rounds in the k-machine model.
    - Explicit metric: For both problems, O(1)-approximation algorithms running in O(log log log n) rounds in the Congested Clique and the MPC model and O(1)-approximation algorithms running in O~(n/k) rounds in the k-machine model.
    Our main contribution is to show the existence of Mettu-Plaxton-style O(1)-approximation algorithms for both facility location with outliers problems. As shown in our previous work (Berns et al., ICALP 2012, Bandyapadhyay et al., ICDCN 2018), Mettu-Plaxton-style algorithms are more easily amenable to being implemented efficiently in distributed and large-scale models of computation.